Abstract

The use of student ratings to measure instructors’ teaching performance and effectiveness in tertiary education 
has been an important but controversial tool in the improvement of teaching quality during the past few decades. 
This study explores non-instructional factors in student evaluations by reviewing the relevant literature on 
the most common non-instructional factors in student ratings. In addition, 
semi-structured interviews were conducted with 14 college instructors. The findings show that most of the teachers 
support the use of student evaluations as a means of quality control and teaching improvement. However, the 
great majority of teachers expressed their concerns about the non-instructional factors which affect student 
ratings and make them meaningless. They reported that gender, time of evaluation, expected grades, nationality 
of the instructor, and other factors can affect student ratings. The study proposes some recommendations which 
might make student evaluation practices more useful and informative. 

Keywords: non-instructional factors, student evaluations, grading leniency, teaching effectiveness, grade 
inflation 

1. Background of the Study 

1.1 Rationale 

The use of student ratings to measure instructors’ teaching performance and effectiveness has been an important 
but controversial tool in the improvement of teaching quality during the past few decades (Spooren et al., 2007). 
However, most tertiary institutions rely on student ratings as an indicator of faculty teaching performance 
(Benton, 2011, p. 41). Empirical research has shown that there is a significant correlation between student 
evaluations of teaching (SETs) and student grades (Isely & Singh, 2005, p. 29). Student evaluations have a 
negative and a pernicious effect on teaching if they are not adjusted appropriately and accurately to measure 
teaching effectiveness (Arreola, 2007; Schneider, 2013). If student evaluations can be increased by giving higher 
grades, then they are a flawed instrument for evaluating teaching. Moreover, this process may contribute to the 
inflation of grades in higher education institutions if faculty members have an incentive to improve their 
evaluations and attract students (Krautmann & Sander, 1999). The findings of Krautmann and Sander (1999) 
indicated that grades affect student evaluations, and faculty have the ability to “buy” higher evaluations by 
lowering their grading standards (p. 61). Schneider (2013) claims that faculty at higher education institutions that 
place significant weight on student evaluations often report that they give out easy grades, avoid controversial 
material, and dumb down courses in order to get higher student evaluations (p. 122). He criticises student 
evaluations as a practice poisonous to the teaching environment, inaccurate, easily manipulated, and lacking 
psychometric sense, reliability, and validity. Furthermore, D’Apollonia and Abrami (1997, cited in Germain & 
Scandura, n.d.) claim that student evaluations are unsophisticated and provide little insight for teaching 
improvement; they are, in their view, only crude judgements of instructional effectiveness. All these are 
considered to be disadvantages of student evaluations aside from their impact on grade inflation. However, it is 
important to keep in mind when judging student evaluations that just because “good teaching is hard to measure 
doesn’t mean that we should give up trying to assess it” (Schneider, 2013, pp. 124-128). Some scholars, such as Arreola (2007), 
claim that student evaluations can offer slightly valid and reliable assessments of teaching if they are “properly 
constructed, appropriately administered, and correctly interpreted” (p. 98). If the purpose of student evaluations 
is to improve and reward the quality of teaching, it is important for higher education institutions to develop a 
system for evaluating teaching that emphasizes the amount students learn and the amount of work they do in a 
course (Schneider, 2013, p. 122). 

The problem explored in this study stems from the fact that student evaluations have become controversial among 
teachers and practitioners. Students need to separate the quality of instruction in their evaluations from the grade 
they received or expect to receive. A substantial number of teachers have expressed their fear and concerns about 
these evaluation forms containing non-instructional elements which may not measure teaching effectiveness or 
help to improve teachers’ pedagogical and instructional skills. Critics believe that students, especially freshmen, 
cannot judge any aspect of teaching (Trout, 1997, cited in Al-Issa & Suleiman, 2007, p. 303). Students do not 
have the knowledge or experience to evaluate the multidimensionality of teaching. Additionally, they argue that 
student evaluations of teachers are often influenced by non-instructional factors, and data obtained from student 
ratings should not be used by those making decisions about teachers (Crumbley, Henry, & Kratchman, 2001; 
Emery et al., 2003, cited in Al-Issa & Suleiman, 2007, p. 304). Ewing claims that student ratings are positively 
correlated with students’ expected grades (2012, p. 141). Therefore, this paper is an attempt to explore and 
identify these non-instructional factors according to teachers’ perspectives, and to discover some possible ways 
of eliminating these factors and improving current student evaluations and evaluation practices. 

1.2 Significance 

This study is significant because student evaluations play an important role in the 
teaching profession and can be used for making significant decisions about improving teaching quality, as well 
as determining the promotion, contract renewal, and salary increases of teachers. Moreover, student evaluation of 
teaching (SET) has been used for pedagogical development and administrative purposes, quality monitoring and 
control, and making decisions on promotions and tenure (Rantanen, 2013, p. 224). The findings of this 
study will therefore be of great value to practitioners, teachers, decision makers, higher educational institutions, and 
quality assurance officers who wish to improve their student evaluation mechanisms and control evaluation 
quality. The findings are expected to inspire the future design of effective student evaluation systems that 
stimulate the professional development of teachers and improve teaching quality. 

1.3 Research Questions 

1. What are the college instructors’ views about the use of student evaluation forms to evaluate teaching 
effectiveness? 

2. What are the non-instructional factors in student evaluations of their teachers? 

3. How could student evaluations be improved to meet instructional needs and improve teaching practices? 

2. Theoretical Background 

2.1 Effective Teaching Fallacy 

Effective teaching is very difficult to measure because of its multidimensionality and complexity, but this does 
not mean we should give up trying to assess and measure it (Schneider, 2013, p. 128). Biggs and Tang (2007) 
argue that “effective teaching requires that we eliminate those aspects of our teaching that encourage surface 
approaches to learning and that we set the stage properly so that students can more readily use deep approaches to learning” 
(p. 31). Moreover, student evaluation forms give students the opportunity to rate their instructors at the end of 
each semester in terms of teaching effectiveness, knowledge about the subject, clarity of course objectives, and 
effectiveness of delivery (Holmes & Smith, 2003, p. 318). A growing body of relevant literature stresses 
student evaluation of teaching as an important source of data in evaluating teaching quality. “Good teaching and 
good learning are linked through students’ experiences of what we do. It follows that we cannot teach better 
unless we are able to see what we are doing from their point of view” (Ramsden, 2003, p. 84). Good teaching is 
difficult to measure (Schneider, 2013, p. 128). It is difficult to define effective teaching due to its 
multidimensionality, complexity, and variability. Researchers have made several attempts to define teaching, 
considering all factors which are related to different theories underpinning learning and teaching and which 
inform and guide teaching effectiveness. A number of researchers (such as Adam, 1997; Brown, 1996; Marsh & 
Dunkin, 1992; North, 1999; Patrick & Smart, 1998, cited in Al-Hinai, 2011) confirm that teaching is 
multidimensional and complex, and therefore, it is difficult to construct a one-size-fits-all definition of effective 
teaching. Moreover, Centra (1993) argues that student learning at the tertiary level is a complex activity and can 
be affected by several factors besides teaching effectiveness, such as student learning styles, level of motivation, 
aptitudes, student effort, and preferences for teaching styles. Furthermore, effective teaching can be shaped and 
based on different learning theories and the ways in which they view effective learning and teaching (Grasha, 
1993, cited in Al-Hinai, 2011). However, some researchers point out that the definition of effective teaching 
should be focused and based on student learning as an important indicator of effective and good teaching 
(Abrami, D’Apollonia, & Rosenfield, 1997; Ellett & Teddlie, 2003; Hativa, 2000; McKeachie, 1997; Ramsden, 
1992, 2003; Seldin, 1998, 1999, cited in Al-Hinai, 2011). Additionally, Arreola (2007) argues that a complete 
definition of good college teaching should include three main dimensions: content expertise, instructional 
delivery skills and characteristics, and instructional design skills. Student learning is considered a component 
of the definition of effective teaching. Moreover, effective teaching at the tertiary level may not necessarily be 
the same as effective teaching at the basic or elementary school or undergraduate level, and teachers who do well 
at one level may not do well at another. Therefore, teacher evaluation forms should consider all these aspects and 
particularly effective teaching, and should be used for both developmental and evaluative purposes. Besides, 
Al-Hinai (2011) points out that one of the underpinning themes of effective teaching is good student learning and 
this has increasingly become an important factor in the overall evaluation of teachers in many parts of the world. 

2.2 Importance of Student Evaluations of Teaching 

Student ratings are of great importance to professors (Hobson & Talbot, 2001; Lindahl & Unger, 2010, p. 71). 
Countries like Portugal, New Zealand, and the USA have implemented student evaluations of their teachers in 
order to improve teaching quality (Delvaux et al., 2013, p. 1). Several scholars claim that student ratings are 
important for quality control in the teaching-learning process. They consider student evaluations an important 
element in quality improvement for quality management applications and student satisfaction, and one of the 
pillars of the quality process (Baraktar et al., 2008; Harvey et al., 1997; Harvey, 2003; Houston et al., 2008; 
Kanji, Malek, & Wallace, 1999; Williams & Cappuccini-Ansfield, 2007, cited in Zineldin, Akdag, & Vasicheva, 
2011, p. 231). Student evaluations should be formative and summative because formative evaluation can 
stimulate teachers’ professional development, while summative evaluation can hold teachers accountable for 
their performance quality (Delvaux et al., 2013, p. 1). Moreover, student ratings of teacher performance and 
teaching effectiveness are generally taken as an important measure of teaching effectiveness because they can 
give both quantitative and qualitative evidence of an instructor’s effectiveness. However, they often misrepresent 
classroom realities (Beyers, 2008, p. 102). Researchers believe that “student ratings are the most valid source of 
evaluating teaching effectiveness, and there is little support for the validity of any other source” (Zhao & Gallant, 
2012, p. 227). Aleamoni supports the use of student ratings for five reasons: 1) student ratings can provide 
information about the accomplishment of major educational objectives; 2) they can provide information about 
the rapport between the students and their teacher; 3) they can provide information about the elements of a 
classroom, such as quality of instructional materials, homework, and instructional methods; 4) they can provide 
information about the kind of communication that exists between students and the instructor; and 5) they can 
also provide consumer data and the freedom for students to choose their instructors (1999, cited in Zhao & 
Gallant, 2012, p. 227). Furthermore, Machina (1987) acknowledged the importance of student ratings in the 
teaching-learning process, as well as the way in which student evaluations can honestly report student 
perceptions about the course and quality of instruction. However, some researchers regard student ratings as 
“meaningless quantification” which leads to personality contests instead of measuring teaching effectiveness 
(Haskell, 1997; Neath, 1996; Sproule, 2002, and Spooren, Mortelmans, & Denekens, 2007, p. 668). Additionally, 
student evaluations of teachers can create a competitive climate among faculty members within colleges and 
departments (Obenchain et al., 2001, p. 100). Student evaluations were formally used in higher 
education for formative evaluation purposes in 1970 (Onwuegbuzie et al., 2007). Although 
there are several arguments against SETs, there are also many arguments in favour of SETs, as identified by 
Murray (1987). SETs can provide feedback and guidelines for teachers for further improvement. They can help 
in assessing teacher performance, informing teacher training, helping students select instructors and courses, 
enhancing the professional status of teachers, promoting the accountability of educational institutions, and 
controlling and assuring quality in teaching and learning. They can also be used for administrative purposes, 
such as determining faculty tenure, promotion, and salaries. Supporters of the use of SETs state that “evaluative 
judgements on a regular basis have strong positive impact on the improvement of instructional skills” (Spooren 
et al., 2007, p. 667). However, they claim that evaluations can only be useful if the students who are being taught 
are trained in using these kinds of forms (Spooren et al., 2007, p. 667). Additionally, the key purpose of student 
evaluations of teaching is providing feedback that leads to better teaching and learning (Marsh, 1987). However, 
measuring teaching effectiveness and teachers’ instructional skills using a single-item level questionnaire 
presents serious methodological problems because teaching quality is not something that is directly observable 
(Spooren et al., 2007, p. 669). Moreover, in order to make student evaluation meaningful, the performance 
criteria should be stipulated in an individual job description for each teacher that should function as a basis of the 
evaluation system (Delvaux et al., 2013, p. 2). 

2.3 Danger of Student Evaluations of Teaching 

Student ratings cannot be considered an effective indicator of teaching quality for the following reasons: they are 
of questionable validity; they do not seem to measure student learning; and they tend to lead faculty to inflate 
their students’ grades and reduce course content (Schneider, 2013, p. 127). SETs are contributing to a 
problematic teaching environment (Schneider, 2013, p. 125). Schneider reports that faculty members at 
universities that place significant weight on student evaluations often report that they inflate their students’ 
grades and avoid teaching controversial material in order to achieve better evaluations from their students (2013, 
p. 122). Student evaluations of teaching have been criticised for measuring student satisfaction instead of the 
quality of instruction because they do not take into account other factors which may affect the course being 
taught, such as the grades given (Zimmerman, 2002, cited in Crumbley, Flinn, & Reichelt, 2010, p. 188). 
Scholars argue that seeking students’ feedback on the teaching effectiveness of their teachers could be a threat to 
academic freedom (Haskell, 1997). There are several arguments against the use of SETs. Some researchers 
believe that SETs are an inappropriate measure of teaching effectiveness because students are immature and lack 
the experience and expertise to judge and evaluate their teachers’ performance. Additionally, SETs are affected 
by many situational factors which are irrelevant to teaching. Furthermore, SETs may be harmful to academic 
quality and standards. SETs usually contain items which are ambiguous, vague, and subjective. Finally, the 
validity and reliability of SETs are questioned by many practitioners, teachers, and researchers as having little to 
do with learning (Kwan, 2000; Al-Hinai, 2011; Emery et al., 2003). In order for teachers to get higher ratings, 
they can simply inflate their student grades, simplify examinations, and lighten the workload in their courses and 
assignments. Sacks (1996, cited in Al-Hinai, 2011) argues that SETs make instructors and students manipulate 
each other for grades and higher ratings. Moreover, some instructors have admitted to relaxing grading standards 
and reducing student workload, ultimately inflating grades to receive higher evaluations (Beyers, 2008, p. 105). 

2.4 Critical Issues in Student Evaluations of Teaching 

The prime purpose of student ratings of teachers is threefold: helping instructors, mapping the quality of teaching 
in tertiary education institutions, and providing information that could help instructors improve their teaching 
(Kulik, 2001). Several higher education institutes rely heavily on commercial forms, which are all based on the 
same design principles, to evaluate their teachers. Centra (1993) claims that the most widely used commercial 
forms are the Student Instructional Report (SIR) which was developed in the 1970s, the Instructional 
Development Effectiveness Assessment form (IDEA) which was developed by Kansas State University, and Marsh’s 
(1992) Students’ Evaluations of Educational Quality (SEEQ). All these forms can be integrated and combined to 
make well-structured rating forms. Researchers such as Centra (1993), Braskamp and Ory (1994), and Marsh 
(1984) identify a number of factors and dimensions which should be combined in SET rating forms. These are: 1) 
course organisation and planning, 2) clarity and communication skills, 3) the teacher’s enthusiasm for the subject, 
4) teacher-student interaction (group interaction and individual rapport), 5) course difficulty, workload, and 
breadth of coverage, 6) grading, examination, and assignments, and 7) student self-rated learning. Some 
researchers have grouped student rating items into 28 dimensions and others identify six to nine factors 
commonly used in student evaluation forms. Many SET rating forms, including SEEQ, include one or two 
“global” rating items, such as an instructor item which asks students to rate their teacher’s overall performance 
in the classroom, and a course item which is intended to elicit student rating of their experience in the course as a 
whole (Cashin, Downey, & Sixbury, 1994, cited in Al-Hinai, 2011). 

Moreover, there are some other critical issues which have been debated in the literature, such as the reliability, 
internal consistency, and validity of SETs. The reliability of SETs has been investigated in relation to their 
stability across time, courses, and instructors (Young, Delli, and Johnson, 1999). The findings were contradictory. 
The internal consistency of these SET forms, which is represented in the agreement among items, was criticised 
in forms based on poorly designed rating scales, poorly worded or inappropriate items, and items that have not 
been subjected to proper psychometric testing, such as reliability or factor analysis. These kinds of evaluation 
forms will not provide useful information (Marsh, 2007). Research has shown that the internal consistency 
reliability of SEEQ forms ranged from .88 to .97 (Marsh, 1982). Lack of agreement between different students’ 
evaluations of the same teacher is usually attributable to differences between the students providing the ratings 
rather than to inconsistency in the ratings of individual students. Moreover, coefficient alpha does not yield an adequate basis for 
measuring the reliability of SET instruments (Marsh, 2007). The validity of SETs is centred on the degree to 
which student evaluations of teaching performance in a classroom setting reflect actual teaching performance 
exhibited by a faculty member (Young et al., 1999, p. 181). Do student evaluations really measure teaching 
effectiveness? Researchers believe that the main reason behind the difficulty in establishing the validity of SETs 
is the lack of agreement upon what effective teaching is. There is no universally accepted definition of effective 
teaching (Cashin, 1995; Cohen, 1981; Elton, 1984; Goodwin & Stevens, 1993; Marsh, 1987, 2007, cited in 
Al-Hinai, 2011). Moreover, Solomon et al. (1997) stress that establishing reliability in student ratings is a critical 
matter because decisions on the quality of teaching are often made on a limited number of student evaluations. 
Unreliable evaluations and ratings can potentially have a significant impact on a teacher’s career (Rantanen, 
2013, p. 225). Therefore, it could be argued that student ratings should not be used as the sole source for making 
decisions with regard to a teacher’s performance and teaching effectiveness. Although teaching evaluation may 
include peer evaluations, retrospective evaluations by alumni, and self-evaluations, most universities tend to rely 
heavily on student ratings when attempting to quantify an instructor’s teaching effectiveness (Hobson & Talbot, 
2001, p. 26). Researchers admit that “well-developed student evaluations with adequate reliability and validity 
data may provide some of the best measures of teaching effectiveness” (Hobson & Talbot, 2001, p. 30). 
Instrument problems such as ambiguous items, positively and negatively skewed items, and items that have no 
relation with classroom teaching performance can affect the validity of ratings (Obenchain, Abernathy, & Wiest, 
2001, p. 100). 
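
The internal consistency coefficients discussed above (for example, the .88 to .97 range reported for SEEQ) are typically coefficient alpha values. As an illustration only, the following minimal Python sketch shows how coefficient alpha could be computed for a small rating form. The function name `cronbach_alpha` and the ratings are hypothetical and are not taken from any of the studies cited; the sketch assumes complete responses on a common numeric scale and uses sample variances.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Coefficient alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(ratings[0])
    if k < 2 or len(ratings) < 2:
        raise ValueError("need at least two items and two respondents")
    # Transpose so that each tuple holds all responses to one item.
    items = list(zip(*ratings))
    item_variance_sum = sum(variance(item) for item in items)
    # Variance of each respondent's summed score across all items.
    total_variance = variance([sum(row) for row in ratings])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical data: five students rating one instructor on four items (1-5 scale).
example_ratings = [
    [4, 5, 4, 4],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
    [4, 4, 5, 4],
]
print(round(cronbach_alpha(example_ratings), 2))
```

A high value indicates only that the items vary together across students. As noted above, it does not by itself establish that the form is reliable across raters or that it measures teaching effectiveness.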

2.5 Biasing Factors in Student Evaluations of Teaching 

Student ratings tend to be somewhat biased in certain key areas if they are not adjusted appropriately to reflect 
teaching effectiveness adequately (Schneider, 2013, p. 125). Student evaluations of their teachers are often 
influenced by many biasing factors which have nothing to do with teaching. Research has increasingly cautioned 
against the use of SET information as the sole source of teacher evaluation data and encouraged the use of 
different evaluation tools, particularly for making personnel decisions, to reflect the multidimensionality of the 
teaching process. One of the critical factors that influences student evaluation is the students’ previous 
relationships and rapport with instructors or the popularity of instructors with students on campus (Germain & 
Scandura, 2005, p. 63). If students have had the instructor in previous courses and earned a good or bad grade, it 
will affect their current ratings. 

Therefore, due to the students’ pre-existing relationships and knowledge about the instructor, their evaluations 
would be more biased than those of staff members who do not have previous knowledge. Moreover, 
demographic factors such as gender, race, ethnicity, and age may affect student ratings and evaluations. Kobliz 
(1993) reviewed several studies on this issue and found that male students rate women instructors more 
harshly than female students do. He added that race could be an influential factor. Minority students may be lenient 
with minority instructors in their ratings. Additionally, the age of students may affect the way in which a staff 
member is perceived by them. Younger students may be more willing to give good ratings to young faculty 
members (Kobliz, 1993, p. 63). Finally, the socioeconomic status and cultural background of the students can 
also affect their evaluations of their instructors. Moreover, research has shown that students usually give better 
and higher ratings for faculty who are less demanding and assign less work and course content (Al-Hinai, 2011). 
Students’ attitudes about the rating process of their teachers are strongly affected by the context of evaluation 
(place and time) and other contextual factors. Furthermore, the length of the form, the ambiguity of the questions, 
and the unclear aims of the evaluation are considered among the factors which may influence the students’ 
attitudes toward the evaluation process. Students usually use their own perceptions of what constitutes “good” or 
“bad” teachers to define effective teaching (Al-Hinai, 2011). Students’ individual conceptions about good 
teaching affect their ratings and evaluations of their teachers. Furthermore, there are internal and external biasing 
factors in SETs. External biases result from differences in teaching situations over which the teacher has little 
or no control, such as student and course characteristics, whereas internal biases revolve around student attitudes 
and perceptions about the teacher and the course, and can impact their ratings in general (Broder & Dorfman, 
1994; Young et al., 1999). Moreover, Braskamp and Ory (1994, cited in Al-Hinai, 2011, p. 102) reported that 
research findings summarise factors which influence student ratings of their teachers in four categories: course 
characteristics, student characteristics, instructor characteristics, and administrative procedures and rating 
instrumentation. Firstly, course characteristics include whether the course is elective or compulsory because 
students tend to rate elective courses higher. Course ratings in higher level courses tend to be higher. Class size is 
also relevant, as smaller classes tend to receive higher ratings. Courses in humanities and the arts tend to receive 
higher ratings, while the social sciences receive lower ratings, and mathematics and sciences receive the lowest 
ratings of all. Finally, class time and workload can affect student ratings, and challenging courses tend to receive 
higher ratings. Secondly, student characteristics are centred on gender, expected results, grade point average 
(GPA), type of degree programme (major or minor), prior interest in the subject, language of instruction in high 
school, and personality. As for gender, students tend to rate same-sex instructors higher. Students expecting 
higher grades tend to give higher ratings. Additionally, students with higher GPAs are more likely to give higher 
ratings. Whether students are registered in a major or minor programme can also affect student ratings: majors 
tend to give higher ratings than minors. Students with prior interest in the subject or the course tend to rate their 
instructors higher than those who have no prior interest. Students whose language of instruction in high school is 
not English tend to be biased by the age, gender, nationality, and personality of their teacher. Thirdly, instructor 
characteristics involve the instructor’s rank, gender, teaching experience, personality, nationality, and research 
productivity. As for instructor’s rank, gender, and teaching experience, there are no consistent, positive, or 
significant relationships in research findings. The personality of the instructor can affect student ratings. Warm, 
enthusiastic, and friendly instructors are generally correlated with high ratings. The instructor’s nationality can 
moderately affect student ratings. Finally, instructors’ research productivity was correlated positively with 
student ratings. Fourthly, administrative procedures and rating instrumentation include the timing of evaluation, 
student anonymity, the presence of the teacher in the classroom, the stated purpose of evaluation, the placement 
of items, and the negative wording of items. The timing of evaluation is important in student ratings; research 
has shown that lower ratings are generally associated with evaluations administered during the final exam. As for 
student anonymity, students tend to give higher ratings when asked to identify themselves. The presence of the 
instructor in the classroom during the evaluation process can affect student ratings. Students give higher ratings 
when their teacher is present in the classroom. Moreover, students usually give higher ratings if the stated 
purpose of the rating is promotion or tenure. Placing specific items before or after another in the evaluation form 
has no significant effect on student ratings. Finally, the negative wording of items has no significant influence on 
student ratings (Al-Hinai, 2011, pp. 102-103). Furthermore, students can use their evaluations as a tool for 
punishing teachers for low grades (Crumbley et al., 2001). Teachers’ behavioural traits, such as the likeability 
factor, could also be considered non-instructional factors which could affect student ratings (Abrami et al., 1982; 
Cardy & Dobbins, 1986; Feldman, 1986; Jackson et al., 1999; Naftulin et al., 1973; Williams & Ceci, 1997, 
cited in Pounder, 2007, p. 180). To sum up, these are the most common non-instructional factors which can 
affect student evaluations of their teachers, and they can be effectively grouped under student-related factors, 
course-related factors, and teacher-related factors (Pounder, 2007, pp. 179-186). 

2.6 Characteristics of Effective Student Evaluations of Teaching 

Most student ratings are used either for necessary diagnostic information or to provide evidence for decision 
making (Avalos & Assael, 2006, p. 257). Features of a good teacher evaluation system should reflect the 
following things: clarity of the purposes and criteria of the evaluation system, perceived fairness and accuracy of 
the evaluation system, teacher satisfaction with their performance and the evaluation process such as the 
credibility of the evaluators, the relationship between the evaluator and the teachers, and the utility of the 
feedback (Delvaux et al., 2013, p. 3). However, there are some suggestions in the literature which could make 
student evaluation more focused and reliable. Firstly, students should be trained in how to rate their instructors to 
reduce halo effects, leniency, and psychometric errors in student evaluations of their instructors’ performance, 
and students also need to be encouraged to separate the quality of instruction from the grade they expect to 
receive in class. Secondly, balancing and weighting items within the form could affect rating. Thirdly, involving 
faculty in creating evaluation forms may reduce skepticism and improve the reliability of the forms. Finally, 
some researchers have suggested replacing student evaluation forms with teaching portfolios which could be 
updated and used annually (Cook, 1989; Marsh, 1993, cited in Germain, 2005, p. 61). 

However, some scholars, such as Arreola (2007), claim that student evaluations can offer slightly valid and 
reliable assessments of teaching if “properly constructed, appropriately administered, and correctly interpreted” 
(p. 98). He expressed his concerns about homemade student evaluation forms not only because of their low and 
dubious quality, but also because of the lack of universally accepted definitions regarding the necessary 
characteristics of good teaching and teaching excellence, and the lack of understanding about the psychometric 
analysis that underlies student evaluations of their teachers (Arreola, 2007). Marsh & Roche (2000) point out 
that the most effective ways for faculty members to earn high student evaluations are to offer students 
demanding and challenging learning materials, to help them master the materials, and to encourage them to value 
and appreciate their learning (p. 226). Furthermore, Centra (2003) believes that small classes with fewer than 15 
students get higher evaluations than do larger classes (p. 498). In addition, students sometimes use these 
evaluations to threaten their instructors by giving them low scores and complaining about their instructors’ 
teaching effectiveness to intimidate and force them to accept late assignments, sloppy work, and all forms of 
excuses (Shapiro, 2002, cited in Crumbley et al., 2010, p. 188). Therefore, Crumbley et al. (2010) recommend 
that instructors should be punished for the unethical teaching techniques they use to “cook” their student 
evaluation scores by inflating grades. Appropriate actions must be taken by administrators against all faculty 
members who inflate their students’ grades or decrease the course materials in order to escape low scores in their 
student evaluations (Crumbley et al., 2010). Student evaluations of teaching forms are usually used for 
administrative control but this has caused grade inflation, coursework deflation, and, as a result, decreases in 
what students are actually taught (Crumbley et al., 2010, p. 187). The relationship between student evaluation of 
teaching and expected grades has been a controversial issue within the literature (Isely & Singh, 2005, p. 29), but 
most higher education institutions in the United States rely heavily on student evaluations to award tenure, 
promotions, and salary increments (Ellis, Burke, Lomire, & McCormack, 2003, p. 36). However, research has 
shown that student evaluations of their teachers are considered to be the main factor correlated with grade 
inflation, but the correlation between high grades and high student evaluations remains controversial (Abbott, 
2008, p. 33). Many scholars have expressed their concerns and fears about the use of student evaluations to 
measure instructors’ teaching performance, which may reward faculty members who routinely give the highest 
grades to their students (Carney, Isakson, & Ellsworth, 1978; Hocking, 1976; and Winsor, 1977, cited in Ellis et 
al., 2003, p. 35). Moreover, a positive correlation has frequently been found between the grades students receive 
and the ratings they give to their instructors (Aigner & Thum, 1986; Anikeef, 1953; Bausell & Magoon, 1972; 
Brown, 1976; Chacko, 1983; Ditts, 1983; Doyle & Whitely, 1974; DuCette & Kenney, 1982; Genehzadeh, 1988; 
Greenwald & Gillmore, 1997; Hockings, 1976; Kau & Rubin, 1976; Kennedy, 1975; Krautmann & Sander, 1999; 
Mehdizadah, 1990; Nelson & Lynch, 1984; Nichols & Soper, 1972; Remmers, 1993; Riley, Ryan, & Lifshitz, 
1950; Scwab, 1975; South, Hill, & Marrison, 1979; Stumpf & Freeddman, 1979, cited in Ellis et al., 2003). All these 
findings are consistent with the hypothesis that “instructors can ‘buy’ better evaluation through more lenient 
grading” (Krautmann & Sander, 1999, p. 59). All these studies have shown that there is a significant relationship 
between the grade given (or anticipated by students) in a particular course and the ratings given by students to 
the instructor who taught the course. Student evaluation of their teachers has been viewed as a double-edged 
sword because instructors who are strict graders, particularly those who give As, are being penalised for their 
rigour by the average student evaluations they receive (Ellis et al., 2003, p. 39). Moreover, recent research has 
shown that teachers who are strict graders facilitate student learning more than do instructors who grade 
leniently (Bonesronning, 1999, cited in Ellis et al., 2003). If instructors who use rigorous and reliable grading 
standards to foster their students’ learning are being penalised in their course evaluations, then good teachers are 
being punished for good teaching. 

3. Methodology 

This qualitative study aims to explore teachers’ views about student evaluations, non-instructional 
factors in student ratings, and some possible strategies for improving student ratings that can inform and shape 
quality teaching. To achieve the objectives of this study, a considerable number of studies on student evaluations 
were reviewed and discussed. Interviews comprising six open-ended questions were then conducted with 14 
instructors from three departments in a public college in the Sultanate of Oman to collect their opinions and 
answer the study questions. The teachers included Omanis, British, Americans, Sudanese, and other nationalities, 
and most of them have been teaching in Oman for a couple of years. The teachers were chosen on the basis of 
their availability and for other practical reasons. The interviews included the following questions: Do you support 
the idea of using student evaluation forms to evaluate teaching effectiveness? If yes or no, why? In your opinion, 
what are the non-instructional factors that affect student evaluations of their teachers? How could student 
evaluations be improved to meet instructional needs and improve teaching practices? The data were analysed 
qualitatively using themes that emerged from the teachers’ responses. 

4. Data Analysis and Discussion 

This part of the article presents the data collected through semi-structured interviews with 14 college instructors 
and compares them with the relevant literature. The interview questions were centred on 
instructors’ views about the use of teacher evaluation forms, the criteria against which student evaluation forms 
should be developed, the non-instructional factors in student evaluations, and the ways in which student 
evaluation forms could be improved to meet instructional needs and improve teaching practices. 

4.1 Research Question 1: What are the College Instructors’ Views about the Use of Student Evaluation Forms to 
Evaluate Teaching Effectiveness? 

In response to this question a selection of comments from individual teachers appears below: 

Most of the teachers supported “the idea of using student evaluation forms because they can be a good indicator of teaching effectiveness if used 
properly and objectively.” They believe that evaluations can help teachers to be aware of their weaknesses and 
strengths. 

One teacher said: “Yes, I do support teaching evaluations because they can serve as an important tool to assess 
quality of teaching.” 


www.ccsenet.org/hes    Higher Education Studies    Vol. 3, No. 5; 2013 

Another teacher said: “Yes, I really support it. I think that since students are at the centre of the learning and 
teaching process, they must have a say in evaluating their teachers.” 

Another teacher reported: “Students’ forms can be used to evaluate teaching effectiveness. But the administration 
of the university should make sure that students are trained and know how to fill in such forms. They need to 
make sure those students’ answers are valid and reliable. They should give the same results whether they are 
administered before the exam or after the exam, before knowing the results or after knowing the results. If 
student ratings of their teachers are affected by the results, this means that such forms are neither valid nor 
reliable.” 

Another teacher said: “Yes, I support that. Teachers need to reflect on their teaching practices using their students’ 
feedback. I think it’s one of the ways that teachers can improve their teaching methods to be more effective.” 

Yet another teacher reported: “Yes, if learning took place, then the teaching was effective. Learners provide 
teachers with a valuable source of feedback regarding their teaching. They (students) know better if the syllabus 
objectives were fulfilled or teaching strategies were effective. All of this can be observed in students’ 
performance.” 

Another teacher said: “Yes, the idea is good, but what I found is that most of students do not read the form and 
randomly tick the options. On the other hand, some are biased because they were looking for some kind of 
favouritism from these teachers in the past”. 

And another teacher reported: “Students like high marks and passing the course easily. Accordingly, any 
instructor regardless of his or her gender, race, and political orientation can be rated highly if he or she gives 
high marks. After grades, students look for friendly teachers who give fewer assignments, exempt them from 
doing assignments, accept late arrivals, forgive not attending classes, and listen to them during office hours. 
Some students practice discrimination against teachers based on their religion, race, origin, and physical 
appearance.” 

The teachers’ responses indicate that most of them believe student evaluations can be used to evaluate teaching 
effectiveness. This is supported in the literature by many scholars who believe that student 
evaluations are an important element in quality improvement for quality management applications and student 
satisfaction, as well as one of the pillars of the quality process (Baraktar et al., 2008; Harvey et al., 1997; Harvey, 
2003; Houston, et al., 2008; Kanji, Malek, & Wallace, 1999; Williams & Cappuccini-Ansfield, 2007, cited in 
Zineldin et al., 2011, p. 231). 

4.2 What are the Non-Instructional Factors in Student Evaluations of Their Teachers? 

There are many non-instructional factors in student evaluations of their teachers which have been discussed in 
the literature, and which are related to students, teachers, or the course. One teacher said: “Although type and 
level of course, instruction [gender and rank], and environment (semester, time of day, and duration) impact the 
non-instructional factors in teaching evaluation, my belief is that it is coloured by the attitudes of the students... 
In some classes all students are highly motivated; here, student evaluation ratings of learning outcomes are 
almost always favourable. On the other side, however, are students who take a class only because it meets a 
requirement or because it was offered at a favourable time.” This is supported by the findings reported by Hobson 
and Talbot (2001) and Al-Issa & Sulieman (2007). Moreover, another teacher described the non-instructional 
factors as “dislike for teachers, lack of interest in the course, unwillingness to give a productive feedback, grades, 
and, most importantly, comparison between teachers.” Yet another teacher said: “In my opinion some of these 
factors are imposed on students, not their choice. So they have to use them. For example, the time of the 
year/day at which evaluations take place is not the students’ choice. I think these factors can be controlled by the 
administration.” It could be argued that the timing of student evaluation forms is very significant in the 
evaluation process in general. Student ratings can be affected by whether the forms are given in the middle or at 
the end of the semester, and before or after the results are released. Another teacher believed that “grades, gender of the 
instructor, political orientation, race, being rural or urban, physical appearance, and leniency (being too friendly 
with students)” are considered the most common non-instructional factors in student evaluations. This is 
consistent with relevant literature which stresses the significance of teacher-related factors as major influences on 
student ratings. Again, a teacher said that there are many non-instructional factors that could affect student 
evaluations, such as “time of the year during which evaluation takes place, gender boy/girl, online/in class, and 
students’ attitude toward the teacher [likes/dislikes]”. 

4.3 How Could Student Evaluations be Improved to Meet Instructional Needs and Improve Teaching Practices? 

There are several potential strategies and suggestions from both relevant literature and teachers which can be 
incorporated to improve student evaluation practices to meet their purposes and improve teaching quality and 
student learning. One teacher said: “I think this needs continuous review by teachers/administrators. Teachers 
can conduct regular surveys to assess their satisfaction with these evaluations, and then revisions and 
improvements can be done accordingly.” It could be argued that involving students in preparing and reviewing 
student evaluation forms could help in improving the evaluation process. 

Another teacher argued that “modifying the questions in the evaluation form helps gather the same information 
in many different ways. This helps the instructor know for sure if the students are biased or being objective.” Yet 
another teacher said: “student evaluations should be done by someone other than the instructor of the course, 
student personal information should be presented or required, and the form should be given out at a suitable time 
- not before an exam, for example.” The relevant literature acknowledges that the presence of the instructor during the 
evaluation process could bias the student ratings; therefore, student evaluations should be 
administered by an independent teacher and at a suitable time, as both of these factors can affect the evaluation process 
negatively so that it may not yield valid results or ratings. Another teacher suggested that 
“teacher evaluation forms must be specific and should concentrate on areas you can improve; course content 
evaluation should be kept separate from teaching styles; students should be clearly informed about the meaning 
of numerical ratings of different categories such as strongly agree, agree, neutral, disagree, and strongly disagree; 
and the evaluation process should be made anonymous to avoid fear of grade influence.” Training students in 
teacher evaluation and making them aware of the consequences of the evaluation and the evaluation scales are of 
utmost importance because such procedures can guarantee reliable and valid evaluations. This is supported by 
another respondent: “students need to know that their feedback will be used for positive change, and they also 
need to be trained in how to give appropriate feedback.” Another teacher said: “student evaluations should be 
made easy and systematic, and we need to avoid features which may have biased answers.” Yet another teacher 
suggested that in order to make student ratings more meaningful, we need to do the following: “student 
evaluation forms should be developed by teachers who are in the field, not by outsiders, and they should be 
based on clear criteria and the goal of evaluation should be to improve teaching, not to point out teachers’ 
mistakes.” It is quite clear that teachers are not in favour of commercial forms and that they support the idea of 
having local forms developed by the teachers themselves. Another teacher offered these suggestions: “the 
questions should be very clear to students, the questions should be objective, the students should be given ample 
time and freedom to give their opinions frankly, and these evaluations should be for the improvement of the 
syllabus and teaching methods, but shouldn’t affect teachers’ evaluation in any way; teachers can be evaluated 
using other means, and forms should be given to students in both English and Arabic languages.” Another 
teacher believed that “there are certain areas in which students have limited qualifications to give faculty 
feedback, such as teaching methods, content covered up to date, motivational methods, the importance of 
assignments, and real world applications. So training is necessary. Because the students are young, 
questionnaires must be provided over the internet, as they will be more likely to complete questionnaires from 
home.” 

5. Conclusion and Recommendations 

This study attempts to explore non-instructional factors that affect student evaluations and teachers’ perspectives 
and views about student evaluations of their teachers’ performance and teaching effectiveness at a public college 
in Oman. Fourteen teachers were interviewed to answer the following study questions: What are the college 
instructors’ views about the use of student evaluation forms to evaluate teaching effectiveness? What are the 
non-instructional factors in student evaluations of their teachers? How could student evaluations be improved to 
meet instructional needs and improve teaching practices? The findings from the qualitative data revealed that the 
vast majority of teachers support the idea of using student ratings if they are used appropriately and objectively. 
Moreover, non-instructional factors related to the evaluation, such as time of the evaluation, place of evaluation, 
and mode of evaluation, were the most frequently reported by teachers. Factors such as being friendly, rank, gender, political 
orientation, presence of the teacher during evaluation, leniency in grading, race, and nationality are the most 
common teacher-related factors cited by the teachers. Based on the findings and the teachers’ suggestions, the 
study recommends the following strategies for improving student evaluations and teaching quality. Firstly, 
special consideration should be given to the non-instructional factors which can affect student evaluations, and 
these factors should be controlled. Students should be made aware of the purpose of evaluation and the consequences of their 
ratings on the instructor’s career. Students should be trained in how to complete these evaluation forms and the 
forms should be administered before the exams to avoid the impact of students’ results on their ratings. 
Furthermore, student evaluations should be bilingual to avoid any confusion. In addition, students should be 
made aware of what constitutes effective teaching in order to make their ratings more meaningful and helpful in 
improving the teaching-learning process. The number of items within the form should reflect all the dimensions 
of effective teaching. This study has several limitations in terms of the methodology used (interviews only), the 
involvement of teachers only rather than students as well, and the collection of data from one college only. Had 
the perspective been expanded to include, for instance, students and other colleges, the study would have yielded 
richer findings. 